A comparative study of spectral transformation techniques for singing voice synthesis
نویسندگان
چکیده
Studies show that professional singing matches well the associated melody and typically exhibits spectra different from speech in resonance tuning and singing formant. Therefore, one of the important topics in speech-to-singing conversion is to characterize the spectral transformation between speech and singing. This paper extends two types of spectral transformation techniques, namely voice conversion and model adaptation, and examines their performance. For the first time, we carry out a comparative study over four singing voice synthesis techniques. The experiments on various data sizes reveal that maximumlikelihood Gaussian mixture model (ML-GMM) of voice conversion always delivers the best performance in terms of spectral estimation accuracy; while model adaptation generates the best singing quality in all cases. When a large dataset is available, both techniques achieve the highest similarity to target singing. With a small dataset, the highest similarity is obtained by ML-GMM. It is also found that the music context-dependent modeling in adaptation, in which detailed partition of transform space is involved, leads to pleasant singing spectra.
منابع مشابه
Control Methods of Acoustic Parameters for Singing-Voice Synthesis
To construct a natural singing voice synthesis system, it is important to adequately control acoustic parameters such as fundamental frequency (F0), phoneme duration, and spectrum information in the synthesis method, based on comparative analysis between spokenand singing-voices. This paper proposes a transformation from read speech into singing-voice using STRAIGHT. This method is composed of ...
متن کاملAnalysis of acoustic features affecting "singing-ness" and its application to singing-voice synthesis from speaking-voice
To construct a natural singing-voice synthesis system, it is important to adequately control acoustic features such as fundamental frequency (F0), spectrum shapes, and phoneme duration in the synthesis method. This paper reveals acoustic features affecting singing-voice perception by comparative analyzing singingand speaking-voices, and then proposes a transforming method from speaking-voice in...
متن کاملImprovements to a Sample-Concatenation Based Singing Voice Synthesizer
This paper describes recent improvements to our singing voice synthesizer based on concatenation and transformation of audio samples using spectral models. Improvements include firstly robust automation of previous singer database creation process, a lengthy and tedious task which involved recording scripts generation, studio sessions, audio editing, spectral analysis, and phonetic based segmen...
متن کاملEmulating Rough and Growl Voice in Spectral Domain
This paper presents a new approach on transforming a modal voice into a rough or growl voice. The goal of such transformations is to be able to enhance voice expressiveness in singing voice productions. Both techniques work with spectral models and are based on adding sub-harmonics in frequency domain to the original input voice spectrum.
متن کاملSinging Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral Envelope
Pleasant singing voice is often ornamented by vibrato. This pitch fluctuation acts as a distinctive feature for singing and promotes voice quality. Nevertheless, independent pitch processing in singing voice synthesis does not guarantee the output quality. The spectral envelope actually varies with pitch during human voice production. This paper proposes a modeling technique for singers’ vibrat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014